I.  INTRODUCTION AND REVIEW OF RELEVANT LITERATURE


A.  INTRODUCTION

This thesis is an investigation of methods for predicting the rate
of reenlistment in the armed forces, specifically the Navy. Since
the advent of the all volunteer force in 1973, one of the major
concerns of the military has been the retention of qualified
personnel beyond their first enlistment. Referred to as first term
reenlistment, this decision has been the object of extensive study
and modeling by each of the services. The vast majority of this work
has centered around the formulation of causal models with a heavy
emphasis on economic factors. During periods when reenlistments have
been below the required levels, these models have been quite good at
capturing the effect of the economic factors used to suggest the
level of monetary compensation. Over the past four years, this
situation has changed to the point where the services are faced with
such large numbers of personnel desiring to remain in the services
that high reenlistment is actually lowering the number of personnel
that can be enlisted for some ratings at the recruiting stations. As
always, there is still a need for more personnel in the nuclear
related ratings, but some of the less technical fields are
approaching an end point where only a limited number of billets may
be available to new accessions.

The  object  of  this  thesis  is  to  attempt  to  construct  a 
short  term  model  that  could  aid  in  predicting  first  term 
reenlistments  for  five  selected  Navy  ratings.  Initial  models 
will  be  developed  utilizing  the  Box-Jenkins  method  of  time 
series  analysis.  These  initial  models  will  be  used  to 


B.  EXAMINATION OF AUTOCORRELATIONS AND PARTIAL AUTOCORRELATIONS

Autocorrelation describes the association between values of the
same variable at different time periods. Autocorrelation
coefficients provide important information about the structure of a
time series. These coefficients can be used to identify trends and
possible seasonality within the data. [Ref. 4]

Partial autocorrelation is used to identify the extent of the
relationship between current values of a variable and earlier
values of that same variable, while holding the
effects  of  all  other  time  lags  constant.  [Ref.  I  ] 
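
The two quantities defined above can be illustrated with a short
computation. The sketch below is not taken from the thesis, which
used Minitab; it assumes the usual sample autocorrelation estimator
and obtains the partial autocorrelations from it with the
Durbin-Levinson recursion. The function names are hypothetical.

```python
from math import sqrt


def autocorr(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def partial_autocorr(acf):
    """Partial autocorrelations from the ACF (Durbin-Levinson recursion)."""
    pacf = []
    phi_prev = []
    for k in range(1, len(acf) + 1):
        if k == 1:
            phi_kk = acf[0]
            phi = [phi_kk]
        else:
            num = acf[k - 1] - sum(
                phi_prev[j] * acf[k - 2 - j] for j in range(k - 1)
            )
            den = 1.0 - sum(phi_prev[j] * acf[j] for j in range(k - 1))
            phi_kk = num / den
            phi = [
                phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                for j in range(k - 1)
            ] + [phi_kk]
        pacf.append(phi_kk)
        phi_prev = phi
    return pacf


def two_standard_errors(n):
    """Approximate two-standard-error bound for a series of length n."""
    return 2.0 / sqrt(n)
```

For a pure first order autoregressive process the partial
autocorrelations cut off after lag one while the autocorrelations
decay gradually, which is the pattern the rating-by-rating
discussion looks for.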

1.  Yeoman

Examination of the autocorrelations for the Yeoman rating, Fig.
2.6, suggests that the data are stationary and should not require
any transformation prior to model analysis. The residuals are
within two standard errors of the mean of zero and appear to be
randomly distributed. The partial autocorrelations, Fig. 2.7, decay
to zero rapidly and appear random after this point, suggesting that
an autoregressive model may be appropriate.

2.  Storekeeper

Examination of the autocorrelations for this data set indicated the
residuals met the randomness criteria of two standard errors. The
shape of the residuals appeared to be a decaying sine curve, Fig.
2.8. The partial autocorrelations showed no shape but dropped to
zero gradually and were randomly distributed, Fig. 2.9, not
suggesting any clear cut model.
The data consisted of monthly summaries of first term reenlistment
percentages for the subject ratings covering the period from
October, 1980 through September, 1983. These data were provided by
Mr. James McEwan, the statistician for the Re-enlistment Programs
Development Office (OP-136). A summary of the data is shown in
Table I. Reenlistment percentages for these ratings ranged from a
low of 41.2 per month (BT) to a high of 91.2 per month (ET).¹ It
should be noted here that the time series plots of the raw data
neither suggest any clear cut trends or seasonality, nor are the
series similar across the ratings.

The time series plots on the following pages represent the
percentages of reenlistments in each of the selected ratings. The
vertical axis is the reenlistment percentage and the horizontal
axis is the time line. The origin for the time line is October,
1980 and the end point is September, 1983.


TABLE I

SUMMARY STATISTICS OF REENLISTMENT RATES

RATE    DATES          MEAN    ST. DEV

YN      10/80-8/83     55.5    11.9
SK      10/80-9/83     49.5    13.2
OS      10/80-9/83     43.6    18.0
ET      10/80-9/83     91.4     6.5
BT      10/80-9/83     41.2    14.7

¹ These numbers are to be viewed in a relative sense for their
ability to provide sufficient population size for their respective
models and not as any measurement of "health" in a particular
rating. This is to say that these percentages do not necessarily
measure need in a particular rating, but rather the percentage of
eligibles reenlisting.

In addition, time series plots of each data set are presented in
Fig. 2.1 through Fig. 2.5.


II.  APPLICATION OF BOX-JENKINS ANALYSIS TO THE UNIVARIATE MODELS

This chapter presents the analysis of re-enlistment data by the
Box-Jenkins technique. Box-Jenkins analysis involves three steps:

1.  Identification - This involves analysis of the time series
plots of the raw data to try to discern any obvious trend or
seasonality.

2.  Estimate - This step involves analysis of the autocorrelations
and partial autocorrelations to provide an estimate for an initial
model.

3.  Forecast - This step involves running the models and generating
predictions which are then evaluated for their adequacy.

Should  any  model  prove  inadequate,  the  estimation  of  the 
model  is  re-evaluated  to  find  a  more  suitable  model. 

A.  INITIAL  ANALYSIS  OF  THE  DATA  SETS 

The Box-Jenkins method was applied to data sets of the number of
first term reenlistees for the following ratings:

1.  Yeoman (YN)

2.  Storekeeper (SK)

3.  Operations Specialist (OS)

4.  Electronics Technician (ET)

5.  Boiler Technician (BT)

These ratings were selected for analysis because they presented a
representative mix in mental category groupings, varying degrees of
general and specific training and also provided sufficient numbers
of reenlistments to perform


4.  Bepko's Regression Model


Bepko [Ref. 1] concentrated his efforts on developing a multiple
regression model for forecasting career retention beyond the first
enlistment. This model was unique in that it evaluated the Navy
rating structure by occupational groupings and utilized an age
specific index for introducing unemployment into the model. This
was the first work to look at ratings and unemployment together in
this manner.

Bepko's [Ref. 1] overall findings were that evaluating the
reenlistment decision by groupings among careerists yielded more
relevant models for the application of bonus payments and that
unemployment among the 25-39 age group was a very significant
factor in the reenlistment decision of careerists.

5.  The  Thomas-Liao  Model  of  Careerists 

The work of Thomas and Liao [Ref. 3] continued along the same track
as Bepko [Ref. 1] in the examination of the reenlistment rate among
careerists, or those considering their second or subsequent
reenlistment. The difference in this model is the unique grouping
consideration given to the rating structure, where the ratings are
aggregated by the patterns of their past reenlistment percentages.
This grouping appears to better capture the effect of the
significant variables on the reenlistment decision. The variables
utilized in this study were national unemployment, the civilian
military pay ratio and tenure as expressed by years of service. The
results of the predictions generated by the regression equation
were excellent and generally less than 13% in total error.
2.  φ = The autoregressive coefficient that describes the model

3.  θ = The moving average coefficient that describes the model

4.  e = The error term of the model.

These two models were initially used to generate independent
forecasts of the actual enlistment supply. Both models were
adequate in capturing trends in the actual enlistment rate but were
somewhat deficient in the actual numbers generated. At this point,
Darling combined the techniques by utilizing the Box-Jenkins method
to take advantage of the high degree of serial correlation
remaining in his regression model's residuals by modeling the
residuals as a separate time series. The result of this method was
to predict 'error' terms to be applied to the results of the
regression equation. This combined model was summarized by eqn 1.4
shown below:

LSVc = LSVmr + Zt'     (eqn 1.4)

where;

1.  LSVc = The predicted value of the combined model

2.  LSVmr = The predicted value using the regression model only

3.  Zt' = The Box-Jenkins model of the residuals of the regression
model

The resulting combined forecasts were substantially better than
either of the techniques could achieve separately and were
extremely accurate in capturing the general trend of the enlistment
supply.
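
The combination in eqn 1.4 can be sketched in a few lines. The
example below is illustrative only and is not Darling's
implementation: it assumes a single-predictor regression and a first
order autoregressive model of the residuals, and all function names
are hypothetical.

```python
def fit_line(x, y):
    """Least squares intercept and slope for y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b


def fit_ar1(z):
    """Least squares phi for z_t = phi * z_(t-1) + e_t (zero mean assumed)."""
    den = sum(z[t - 1] ** 2 for t in range(1, len(z)))
    if den == 0:
        return 0.0  # no residual structure to model
    return sum(z[t] * z[t - 1] for t in range(1, len(z))) / den


def combined_forecast(x, y, x_next):
    """One-step forecast: regression prediction plus predicted residual."""
    a, b = fit_line(x, y)
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    phi = fit_ar1(resid)
    regression_part = a + b * x_next   # plays the role of LSVmr
    residual_part = phi * resid[-1]    # plays the role of Zt'
    return regression_part + residual_part
```

When the regression residuals carry no serial correlation the
residual term vanishes and the combined forecast reduces to the
regression forecast alone.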
the first was a standard multiple regression of the logit form:

S(M) = a M0 / [ k M0 + (a - k M0) e^(-a(M - M0)) ]     (eqn 1.2)

where;

1.  S  =  The  supply  of  military  recruits 

2.  M  =  The  military  wage 

3.  a  =  The  stochastic  error  term 

In  this  phase  of  development,  the  following  variables  were 
introduced : 

1.  Civilian  military  pay  ratio 

2.  National  unemployment  for  16-19  year  old  males 

3.  Monthly  leads  from  print  media 

4.  Number  of  Marine  recruiters 

5.  Dummy  variable  to  account  for  anomalies  in  a  partic¬ 
ular  time  period. 

Further analysis led to inclusion of items one, two and five from
the above list in the final regression model.

Next, a totally independent model based on Box-Jenkins methodology
was developed for Marine accessions. This was a univariate model
whose purpose was to capture the effect of seasonality that was
missed by the regression model. The Box-Jenkins method yielded a
model of the form:

■  tyt-rfy  h-v  *v«+e.+et-0/et.(i  (e«"  1-3> 

where;

1.  Y = The number of high school graduates in mental categories I
and II that enlist in the Marine Corps in month t.


2.  M(j) = Monetary returns to military service from period 't'
through 'n'

3.  K(n) = Lump sum payment of the present value of the expected
post service civilian wages realized by those staying in the
military until time 'n'.

4.  R(n) = Lump sum payment of the present value of the expected
retirement benefits realized by those staying in the military until
'n'

5.  W(t) = Present value in year 't' of the expected civilian wages
realized by those leaving the military in year 't'.

6.  R(t) = Present value in year 't' of the expected civilian
retirement payments for those leaving the military in year 't'

7.  r = personal discount rate
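
The quantities listed above can be combined into a present value
comparison of staying versus leaving. The sketch below is an
assumption about how the terms fit together, based only on the
definitions given here; it is not Warner's published specification.
The function name, the argument order and the choice to leave the
first pay period undiscounted are all hypothetical.

```python
def cost_of_leaving(pay, k_n, r_n, w_t, r_t, rate):
    """Net present value of staying until 'n' versus leaving at 't'.

    pay  : military returns M(j) for j = t .. n, in order (assumed)
    k_n  : lump sum value of post service civilian wages at 'n'
    r_n  : lump sum value of military retirement benefits at 'n'
    w_t  : present value of civilian wages if leaving at 't'
    r_t  : present value of civilian retirement if leaving at 't'
    rate : personal discount rate r
    """
    horizon = len(pay) - 1  # number of periods between 't' and 'n'
    stay = sum(m / (1 + rate) ** i for i, m in enumerate(pay))
    stay += (k_n + r_n) / (1 + rate) ** horizon
    leave = w_t + r_t
    return stay - leave
```

A positive value favors staying. Raising the personal discount rate
shrinks the discounted returns to staying, which illustrates why the
model is sensitive to that parameter.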

As can be seen, this model relies on development of several
sub-models relating to civilian wage structure, future policy
decisions concerning lump sum bonus payments and an individual's
personal discount rate over time. While the results obtained with
this model have been superb, the model itself is complex and
somewhat difficult. Since the introduction of the general model
described above, it has been further refined to include a 'taste'
factor for military service and a random disturbance term which is
used to capture the effects of sea shore rotation, poor duty
station and family separation. These factors improve the model but
at a cost of ever increasing complexity.

3.  Darling's Model of Marine Corps Enlistments


Darling [Ref. 2] utilized a combination of Box-Jenkins time series
analysis and multiple linear regression to predict the supply of
upper mental category recruits to the Marine Corps. The procedure
entailed development of three separate models using two distinct
techniques.


B.  REVIEW  OF  RELEVANT  LITERATURE  IN  THE  FIELD 


1.  Overview


To the best of the author's knowledge, the application of
Box-Jenkins time series analysis to reenlistment models has not
been previously attempted. A search conducted by the Defense
Technical Information Center (DTIC), utilizing both title and
subject, did not reveal any references to the use of Box-Jenkins
modeling for reenlistment. Bepko [Ref. 1], in his thesis modeling
career petty officers, reviews the relevant literature concerning
first term reenlistment and career reenlistment from 1974 to the
publication of his thesis in 1981. This review will address
relevant publications since that time.

2.  The Annualized Cost of Leaving (ACOL) Model

This model was developed for the Navy by John Warner at the Center
for Naval Analyses and is currently the most widely used model in
the Navy with relation to manpower and personnel policy decision
making.

The model itself is a sophisticated multiple linear regression that
attempts to capture several important underlying forces in the
reenlistment decision. The general model is of the form:


C(t,n) = SUM(j=t..n) [ M(j) / (1+r)^(j-t) ]
         + [ K(n) + R(n) ] / (1+r)^(n-t)
         - [ W(t) + R(t) ]     (eqn 1.1)


where;

1.  C(t,n) = Net present value of pecuniary and non-pecuniary
returns of staying in the military until time 'n' as compared to
leaving at time 't'.




predict the level of reenlistment for the selected ratings. Next, a
leading indicator model will be developed utilizing the national
unemployment rate for 20-24 year olds. Then, a refined forecast
will be developed combining both time series and causal models.

The potential advantage of time series analysis lies in its
simplicity and lack of reliance on external factors. To many, this
is viewed as a shortcoming since it cannot explain causal
relationships such as the effect of advertising dollars on sales
for a company. However, the effectiveness of advertising dollars is
not easy to determine and, in many cases, may lead to false
conclusions when used in classical regression analysis. Other
errors such as the autocorrelation of an independent variable or
the multicollinearity of several variables are also avoided in the
use of time series analysis. Since time series analysis relies on
the ability of a series to reproduce itself over time, it is free
of these errors of regression and still retains the ability to
adequately forecast events, or in this case, levels of
reenlistment. A more thorough explanation of the Box-Jenkins method
is presented in Appendix B.

If a time series model is accurate at predicting changes in trends
as well as levels of reenlistment, then it may also be useful as a
tool to adjust the levels of reenlistment bonuses to the most cost
effective level necessary to retain the desired force levels. If a
model can accurately foresee a significant increase in
reenlistments, independent of the reenlistment bonus, the bonus
level for that rating can be scaled down appropriately to retain
personnel without the payment of an economic rent. (Rent, in this
case, is the payment of a bonus for a decision independent of the
bonus award level.) To summarize, time series analysis may not
prove to be a panacea in forecasting reenlistment rates, but it
most definitely is a tool that deserves wider consideration and
application.
Figure 2.7  PARTIAL AUTOCORRELATIONS FOR YN RATING.


3.  Operations  Specialist 


Analysis of the autocorrelations, Fig. 2.10, and partial
autocorrelations, Fig. 2.11, for the Operations Specialist data set
again indicated that the residuals were randomly distributed and
had characteristics which strongly suggested the use of some type
of autoregressive operation in model selection.
4.  Electronics Technicians

The data set for the Electronics Technicians exhibited a strong
trend when the residuals were evaluated. In addition, some
non-stationarity was suggested, which could require differencing²
to remove. Fig. 2.12 illustrates the shape of the ACF function.
Evaluation of the resultant autocorrelation showed that this
process may not be necessary, as the first lag exceeds -0.5 in
magnitude, which is a classic indication of an overdifferenced data
set.

The shape of the autocorrelation and the partial autocorrelation,
Fig. 2.13, suggest that an autoregressive or possibly a mixed model
may be appropriate for evaluation.
² The method of differencing converts a non-stationary time series
into a stationary one. It consists of subtracting successive values
from one another and using their difference as a new time series
[Ref. 4].
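
Differencing and the overdifferencing check on the first lag can be
sketched as follows. This is an illustration under stated
assumptions rather than the procedure run in Minitab for the
thesis; the function names and the -0.5 default threshold argument
are hypothetical conveniences.

```python
def difference(series):
    """First difference: each value minus the value before it."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


def lag1_autocorr(series):
    """Sample autocorrelation at lag one."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return sum(dev[t] * dev[t - 1] for t in range(1, n)) / denom


def looks_overdifferenced(differenced, threshold=-0.5):
    """True when the lag-one autocorrelation is at or below the threshold."""
    return lag1_autocorr(differenced) <= threshold
```

Differencing a series that was already stationary tends to push the
lag-one autocorrelation of the result toward -0.5 or below, which is
the warning sign described for the Electronics Technician data.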
Figure 2.13  PARTIAL AUTOCORRELATIONS FOR ET RATING.

5.  Boiler Technicians

The Boiler Technician rating data set exhibited no strong trend in
the autocorrelations, Fig. 2.14. The shape of the autocorrelation
and partial autocorrelation, Fig. 2.15, also strongly suggested
that an autoregressive type of model should be considered for
evaluation. This was indicated by the decaying sine wave pattern in
the ACF and the abrupt cutoff of the value of the PACF.


C.  MODEL  DEVELOPMENT 

In developing the models, the autocorrelations and partial
autocorrelations were evaluated against representative Box-Jenkins
models of the autoregressive (AR) and moving average (MA) type. A
best fit model was then selected for evaluation utilizing the
Minitab General Purpose Statistical Computing System. The results
of these models were then evaluated to ensure the residuals were
random and less than two standard errors from the mean of zero. If
the model was satisfactory at this point, the sum of squared errors
(SSE) was evaluated to determine if it was the lowest possible
reduction in the SSE. In the event that more than one model passed
these tests, the t-ratios were then evaluated. Again, if all these
tests were insignificantly different among a family of models, the
principle of parsimony was used to select the "best" model. In
every model developed during this phase of analysis, all three of
the criteria were met by only the model selected, which made the
choice of the correct prediction model relatively easy.
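
Two of the checks described above, residual randomness within two
standard errors and comparison of the SSE, can be sketched for a
first order autoregressive candidate. This is a simplified
illustration, not the Minitab routine used in the thesis: it fits
the model by ordinary least squares, and the function names and the
default of six lags are assumptions.

```python
from math import sqrt


def fit_ar1(series):
    """Least squares fit of y_t = c + phi * y_(t-1) + e_t."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    phi = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    c = my - phi * mx
    resid = [b - (c + phi * a) for a, b in zip(x, y)]
    return c, phi, resid


def sse(resid):
    """Sum of squared errors of a set of residuals."""
    return sum(e * e for e in resid)


def residuals_look_random(resid, max_lag=6):
    """True if every residual autocorrelation is within 2/sqrt(n)."""
    n = len(resid)
    mean = sum(resid) / n
    dev = [e - mean for e in resid]
    denom = sum(d * d for d in dev)
    if denom == 0:
        return True
    bound = 2.0 / sqrt(n)
    for k in range(1, min(max_lag, n - 1) + 1):
        r = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        if abs(r) > bound:
            return False
    return True
```

Among candidates whose residuals pass the randomness check, the one
with the smallest SSE would be carried forward; ties would then be
settled by the t-ratios and parsimony as described in the text.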

Once  these  models  were  selected,  forecasts  were  gener¬ 
ated  for  the  period  October  1983  to  March  1984  and  compared 
to  the  actual  reenlistment  totals  for  the  period.  If  a  model 
was  evaluated  as  totally  inappropriate  at  this  point,  then 
further  investigation  and  modeling  was  pursued  to  attempt 
resolution  of  the  problem. 

Table  II  presents  a  summary  of  the  model  forecasts  for 
the  selected  ratings.  Model  summaries  and  statistics  are 
presented  in  appendix  C. 


TABLE II

RESULTS OF UNIVARIATE MODEL FORECASTS

                       95% FORECAST LIMITS
PERIOD     FORECAST    LOWER     UPPER     ACTUAL    ERROR

YN RATING

OCT 83     67.9        46.3      89.5      60.7      11.8
NOV 83     61.3        37.6      84.9      55.6      10.1
DEC 83     58.3        34.2      82.4      66.0      11.6
JAN 84     57.0        32.8      81.1      60.8      17.2

FEB 

34 
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32.2 
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59.1 
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84 

56.1 

3  1.9 

80.2 
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83 
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35.6 

85.2 

64.6 

06. 
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35. 1 

8  5.6 
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62.5 
1.  Yeoman Model

The model selected for the Yeoman rating was of the ARIMA (1,0,0)
type. A comparison of the forecasted re-enlistment percentages with
the actual totals showed an average error of .106 with a range from
a low of -.172 to a high of .118. All of the observations were
within the 95% confidence limits of the model and were also
captured in shape by the model. Fig. 2.16 graphically illustrates
the forecasts and observations.

Figure  2.16  PLOT  OF  UNIVARIATE  MODEL  FOR  YN  RATING. 
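
The forecasts, 95% limits and errors reported for a first order
autoregressive model can be reproduced in outline as follows. The
sketch assumes the standard ARIMA (1,0,0) forecast recursion and
assumes the error is measured as (forecast - actual) / actual, which
appears consistent with the October 1983 Yeoman row of Table II. The
parameter values passed in are the caller's own; none of the
thesis's fitted coefficients are reproduced here.

```python
from math import sqrt


def ar1_forecast(last_value, c, phi, sigma, steps):
    """Point forecasts with approximate 95% limits for an AR(1) model.

    Returns a list of (forecast, lower, upper) tuples, one per step.
    sigma is the residual standard deviation of the fitted model.
    """
    out = []
    point = last_value
    var_sum = 0.0
    for h in range(1, steps + 1):
        point = c + phi * point
        var_sum += phi ** (2 * (h - 1))  # forecast variance grows with h
        half_width = 1.96 * sigma * sqrt(var_sum)
        out.append((point, point - half_width, point + half_width))
    return out


def relative_error(forecast, actual):
    """Forecast error as a fraction of the observed value (assumed form)."""
    return (forecast - actual) / actual
```

The widening of the limits with each step ahead is the reason the
interval around the later months in Table II is broader than the
interval around the first forecast month.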
2.  Storekeeper Model

Since the autocorrelations and partial autocorrelations did not
clearly suggest the selection of one model type as being superior
to another, an iterative method was used for selection. After
discarding several first order AR, MA and ARMA models it was
decided to try differencing even though the data did not strongly
suggest that this was necessary. The resultant ACF and PACF
satisfied the randomness criteria without indicating
overdifferencing. A model of the ARIMA (0,1,1) order was then found
to meet all the necessary stringency requirements for model
selection. The model generated forecasts with an average error of
.132 and a range from -.064 to -.194. It is felt that iterative
modeling with the actual data for the forecast period would reduce
this error even further. Fig. 2.17 illustrates the data fit with
the model.

3.  Operations  Specialist  Model 

Modeling of the Operations Specialist ... resulted in the selection
of an ARIMA (1,0,0) ... appropriate model. The data for the rating
... largest fluctuation in range over the entire data ...
fluctuations have no doubt influenced the ... predictive power of
the selected model. ... model yielded acceptable predictions
varying from ... observations by an average of .155 and a range ...
-.315. As with the previous model, successive ... with the new
observations should improve the ... power of the model. Fig. 2.18
illustrates the ... with the actual observations.
Figure 2.17  PLOT OF UNIVARIATE MODEL FOR SK RATING.

4.  Electronics  Technician  Model 

A first order autoregressive model of the ARIMA (1,0,0) type was
also selected for the Electronics Technician rating. The model
produced spectacular results with an average error of -.025 and a
range of errors from -.003 to -.045. It should be noted that this
data set is also the most stable over time with an average of more
than 91% first term reenlistments. While this makes the model's job
somewhat easier, it still remains a powerful model for univariate
predictions. Fig. 2.19 illustrates the fitting of the model.
Figure 2.18  PLOT OF UNIVARIATE MODEL FOR OS RATING.

5.  Boiler Technician Model

In the case of the Boiler Technicians, an ARIMA (1,0,0) was again
evaluated as the most appropriate; however, this model was the
poorest predictor of any selected for evaluation. The average error
was .203 with a range from -.328 to .232. The model also failed to
capture the shape of the actual data, which presented an upward
trend while the model indicated a downward turn. The model is,
however, acceptable for further analysis and refinement in the
transfer function model. Fig. 2.20 illustrates the fitting of the
data set predictions.
Figure 2.19  PLOT OF UNIVARIATE MODEL FOR ET RATING.


D.  SUMMARY

In this chapter, five acceptable univariate models have been
developed for the respective data sets. These models were in most
cases fairly obvious from the analysis of the autocorrelations and
partial autocorrelations; however, the process can be fairly time
consuming and result in the pursuit of several "blind alleys" on
the way to a workable model. In succeeding chapters, a transfer
function model utilizing unemployment statistics for the 20-24 age
group will be developed.
F.  THE COMBINED MODEL

The regression models in the previous section yielded adequate
forecasts of the reenlistment rate. They were not, however,
significantly better than the forecasts for the univariate models
in chapter three. Additionally, the residuals of the regression
model exhibited strong positive serial correlation as indicated by
the low values of the Durbin-Watson statistic for each model. Table
IV summarizes the data for the regression models. As shown by
Darling [Ref. 2], this enables the residuals to be constructed into
an independent time series. Through the application of the
Box-Jenkins method, a forecast of the residuals or error terms can
be generated and applied to the regression models' forecasts. This
procedure should yield a forecast with a better fit to the actual
data.


TABLE V

REGRESSION MODELS FOR SELECTED RATINGS

RATE    LAG    CONSTANT    BETA     ST.DEV.    R-SQRD    DURBIN-WATSON

YN      4       9.05       3.32      9.97       .327      1.43
SK      7       1.59       3.57     11.88       .253      1.75
OS      4      73.2       -2.08     16.82       .037       .81
ET      1      73.2        1.31      5.53       .181      1.22
BT      5      52.6       -.743     14.93      -.023       .84

The  Box-Jenkins  methodology  as  described  in  Chapter 
three  was  again  applied  to  the  sets  of  regression  model 
residuals  and  appropriate  models  were  selected.  Table  VI  is 
TABLE IV

SUMMARY OF R-SQUARED AND DURBIN-WATSON STATISTICS

X = UNEMPLOYMENT FOR 20-24 YEAR OLD AGE GROUP

SELECTED RATINGS R-SQUARED VALUES
(DURBIN-WATSON STATISTIC IN PARENS)

LAGS (X)   YN             SK             OS             ET             BT

 1         .265 (1.54)    .22  (1.77)    .027 (.82)     .181 (1.22)    -.03  (.87)
 2         ...            .154 (1.81)    .078 (.83)     .141 (1.2)     -.03  (.87)
 3         ...  (1.55)    .199 (1.56)    .025 (.84)     .133 (1.22)    -.031 (.84)
 4         .349 (1.43)    .163 (1.72)    .037 (.81)     .115 (1.51)    -.027 (.83)
 5         .213 (1.3)     .134 (1.76)    .031 (.81)     .041 (1.39)    -.023 (.84)
 6         .259 (1.31)    .229 (1.76)    .006 (.79)     .022 (1.55)    -.032 (.79)
 7         .272 (1.25)    .253 (1.75)    .002 (.8)      .003 (1.6)     -.033 (.81)
 8         .21  (1.16)    .224 (1.6)     .004 (.79)    -.009 (1.65)    -.031 (.74)
 9         .148 (1.16)    .183 (1.58)   -.023 (.81)     .024 (2.08)    -.027 (.73)
10         .171 (1.14)    .193 (1.69)   -.023 (.79)     .024 (2.08)    -.027 (.75)
11         .098 (1.3)     .115 (1.97)   -.013 (.74)    -.011 (2.08)    -.043 (.75)
12         .203 (1.66)    .130 (2.24)   -.045 (.71)     .051 (2.09)    -.042 (.8)

models. This procedure should yield forecasts with a better fit
than was possible with only the univariate or regression models.
This procedure closely parallels the work of Darling [Ref. 2] in
modeling the supply of recruits for the Marine Corps. The combined
model will be developed in the next section.
Boiler Technician model. These values represent the best R-squared
that were obtained at all lagged values of the independent variable
and not just the ones that were indicated as being significant in
the leading indicator model. The t-ratios were somewhat stronger,
ranging from .1 for Storekeepers to 11.3 for Electronics
Technicians, with three of the five ratings having values above the
minimum acceptance level of 2.0. The Durbin-Watson statistic was
weak for the models as well, with only the Storekeeper model
clearly within the acceptable range of 1.5 to 2.5 for the data.
This indicates that the regression failed to remove all of the
serial correlation present in the data sets and the residuals are
positively correlated with one another.

While these models are disappointing, they are not discouraging.
They seem to indicate that, when taken alone, unemployment does not
possess the strong predictive ability it seems to have in other
econometric models. Bepko [Ref. 1] and Darling [Ref. 2], in their
regression models, attribute nearly 50 percent of the explained sum
of squared error to unemployment. The present results may indicate
that this relationship does not hold in modeling the behavior of
first term reenlistments for military personnel. It should be noted
that Bepko [Ref. 1] constructed an aggregated model of careerists
using the 25-39 age group for unemployment and Darling [Ref. 2]
utilized national teenage unemployment in modeling Marine Corps
enlistments.

The predictions for the regression model are compared to
the observed levels of reenlistment for the period October
1983 to March 1984. These forecasted levels are shown in
Table VI. In addition, the regression forecast for the entire
data set from October 1980 to September 1983 will be
generated, and the residuals for that data set will be
independently modeled as a new time series with a forecast of
residuals to be applied to the forecasts from the regression
D.  LINEAR REGRESSION


In order to verify the leading indicator models, a
linear regression will be constructed using 20-24 year old
unemployment as the independent variable and reenlistment
rates for the selected ratings as the dependent variable.
For this process to be valid, certain assumptions must hold
before it is applied. The first assumption is that there is
a linear relationship between the variables as described by
eqn 3.1

$Y_i = \alpha + \beta X_i + e_i$    (eqn 3.1)

where for each observation, Y is a random variable. The
second assumption is that X is fixed in value and the final
assumption is that e, the error term, has an expected value
of zero with constant variance for all observations. It is
further assumed that the e's are normally distributed and
uncorrelated. [Ref. 6]
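
The regression of eqn 3.1 and the three statistics used later to judge it (R-squared, t-ratio and Durbin-Watson) can be illustrated with a minimal sketch. The function names `lagged_pairs` and `simple_regression` are hypothetical and introduced only for illustration; they are not the software used in this thesis, and the sketch assumes a single independent variable.

```python
import math

def lagged_pairs(x, y, lag):
    """Pair x[t - lag] with y[t], so that x leads y by `lag` periods."""
    n = len(x)
    return x[:n - lag], y[lag:]

def simple_regression(x, y):
    """Least squares fit of y = a + b*x with basic diagnostics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                      # slope
    a = mean_y - b * mean_x            # intercept
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    sse = sum(e * e for e in resid)    # unexplained sum of squares
    sst = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - sse / sst
    # Standard error of the slope, then its t-ratio.
    se_b = math.sqrt(sse / (n - 2) / sxx)
    t_ratio = b / se_b
    # Durbin-Watson: values near 2 suggest no first-order serial correlation.
    dw = sum((resid[i] - resid[i - 1]) ** 2 for i in range(1, n)) / sse
    return {"a": a, "b": b, "r_squared": r_squared,
            "t_ratio": t_ratio, "durbin_watson": dw}
```

In use, `lagged_pairs(unemployment, reenlistment, lag)` would first align the two series at the lead suggested by the cross-correlation function, and the aligned lists would then be passed to `simple_regression`.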

E.  MODEL  VERIFICATION 

Linear regression models were run for all of the data sets
using the lagged value of the unemployment data as the
independent variable and the reenlistment data as the dependent
variable. The regressions were evaluated using the R-squared
value, Durbin-Watson statistic and t-ratio. A summary of the
R-squared and Durbin-Watson statistics for the regression
models is presented in Table IV.

The  regression  models  for  all  of  the  ratings  were  less 
than  robust  in  their  ability  to  verify  the  significance  of 
the leading indicator models. R-squared values ranged from a
high of .34 for the Yeoman model to a dismal -.02 for the
5.  Boiler Technicians

Consistent with the previous models, the model for
Boiler Technicians also presented more than one prominent
point for evaluation. These points occurred at the nine and
twelve month points. Fig. 3.9 is the cross-correlation
function for the lead values of the model.

Figure 3.9 CROSS CORRELATION FOR BT RATING.
4.  Electronics Technicians

The cross-correlation function for this model also
suggested more than one lead point for investigation. These
occurred at the six and twelve month points but again were
significant only in relative terms and not in absolute
magnitude. Fig. 3.8 illustrates the lead relationship in the
cross-correlation function.

Figure 3.8 CROSS CORRELATION FOR ET RATING.


3.  Operations Specialists

The cross-correlation function again indicated more
than one point for possible investigation as being
relatively significant. These points occurred at the six, nine
and twelve month points. Fig. 3.7 again shows the lead
values for this model.
2.  Storekeepers


The cross-correlation function for the storekeeper
model indicated a possible lead indicator relationship at
the five month and eleven month points. Both of these points
were significant in their relationship to the other values
but again were below the accepted level for determining
significance. It was decided to investigate the significance
of these points in the regression procedure. Fig. 3.6 shows
the cross-correlation function for positive leads of this
model.

Figure 3.6 CROSS CORRELATION FOR SK RATING.
C.  APPLICATION TO THE SELECTED RATINGS DATA SETS


1.  Yeoman

When the ARIMA (0,1,1) model was applied to both the
reenlistment data for yeoman and the differenced
unemployment data set and the cross-correlation function
evaluated, there appeared to be a relationship at the twelve
month lead point. The value at this point was significant
when compared to the other points; however, it was not
significant in absolute terms since it was not in excess of
two standard errors from the mean of zero. This observation
of magnitude holds true in all of the models. Fig. 3.5 shows
the cross-correlation function for this data set for lead
months only.
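
The screening of lead points against a two standard error band can be sketched as follows. This is an illustration only: the function names are hypothetical, and the standard error is approximated as 1/sqrt(n), a common large-sample assumption that may differ from the value computed by the software used here.

```python
import math

def cross_correlation(x, y, lead):
    """Correlation between x[t] and y[t + lead] for lead >= 0."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sd_x = math.sqrt(sum((v - mean_x) ** 2 for v in x) / n)
    sd_y = math.sqrt(sum((v - mean_y) ** 2 for v in y) / n)
    cov = sum((x[t] - mean_x) * (y[t + lead] - mean_y)
              for t in range(n - lead)) / n
    return cov / (sd_x * sd_y)

def significant_leads(x, y, max_lead):
    """Leads whose cross-correlation exceeds two approximate standard errors."""
    bound = 2.0 / math.sqrt(len(x))   # assumed standard error of 1/sqrt(n)
    return [k for k in range(max_lead + 1)
            if abs(cross_correlation(x, y, k)) > bound]
```

Here `x` would be the residuals of the unemployment model and `y` the residuals obtained by applying the same model to a rating's reenlistment series. With 35 observations the band is roughly plus or minus .34, which illustrates why peaks that stand out relative to their neighbors can still fall short of absolute significance.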
TABLE III

SUMMARY OF UNEMPLOYMENT MA 1 MODEL


NUMBER  TYPE  ESTIMATE  ST. DEV.  T-RATIO
1       MA 1  0.9850    0.0689    14.29

DIFFERENCING:  1 REGULAR
RESIDUALS:     SS = 896.9 (BACKFORECASTS EXCLUDED)
               DF = 34   MS = 26.4
NO. OF OBS.:   ORIGINAL SERIES 36, AFTER DIFFERENCING 35
Figure 3.3 PARTIAL AUTOCORRELATIONS FOR UNEMPLOYMENT DATA.


point, the autocorrelation and partial autocorrelation
functions suggested that a moving average model was
appropriate. This model of the ARIMA (0,1,1) type met all of
the necessary selection criteria and was therefore adopted
for use. Model results and specifications are presented in
Fig. 3.1 through Fig. 3.4.


Figure 3.1 TIME SERIES PLOT FOR UNEMPLOYMENT DATA.


1.  Methodology


The unemployment time series model was constructed
utilizing the same methodology as applied in the preceding
chapter to reenlistment time series for the selected rating
models. The unemployment data was initially transformed by
computing the relative percentage change from one period to
the next. This was done by subtracting the rate in the
current period from the rate in the preceding period and
dividing the remainder by the rate of the preceding period.
In doing this, it was felt that the resulting model would
better capture responses to changes in the unemployment rate
rather than responses to the overall level. It was further
hypothesized that this would capture any perceptions by the
service member that the job market was improving or
worsening in relation to the demand for a particular skill
[Ref. 5].
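
A minimal sketch of the relative change transformation follows. The function name and the sample rates are hypothetical. The sketch assumes the conventional sign, in which a rise in unemployment gives a positive change (current rate minus preceding rate); read literally, the description above would give the opposite sign, which would only reverse the sign of the resulting correlations.

```python
def relative_change(rates):
    """Period-to-period relative change: (current - preceding) / preceding."""
    return [(rates[t] - rates[t - 1]) / rates[t - 1]
            for t in range(1, len(rates))]

# Hypothetical monthly unemployment rates in percent (not thesis data).
example_rates = [10.0, 11.0, 9.9]
example_changes = relative_change(example_rates)  # roughly +10% then -10%
```

The transformed series is one observation shorter than the original, since the first period has no preceding value.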

The computed change in unemployment time series was
then evaluated for a potential model by screening the
autocorrelation and partial autocorrelation functions. The data
appeared stationary but did not suggest any obvious model
for selection. As a result a trial and error method was used
for model selection. The model iterations were evaluated for
suitability utilizing the same criteria described in the
previous chapter, that is evaluation of residuals for
randomness, smallest sum of squared errors and a t-ratio in
excess of 2.3. Several trials of autoregressive models,
moving average models and mixed autoregressive moving
average models did not yield any positive results.
Therefore, it was decided to difference the data one time
and try the iterative model building process again.

The resulting differenced data set also met the
stationarity criteria and did not exhibit any
characteristics of an overdifferenced data set. When evaluated at this
III.  THE LEADING INDICATOR, REGRESSION AND COMBINED MODELS

A.  OVERVIEW 

A leading indicator model for reenlistment in the
selected ratings was constructed by first developing a
univariate time series model for unemployment in the 20-24
year old age group and then applying this model to the data
for the selected ratings. The resultant model residuals were
then cross-correlated to establish the location of any time
leads or lags that affect reenlistments. These indicators
could provide an early warning system of shifts in the level
of reenlistment and/or the direction of the trend in
reenlistment. Once developed, the adequacy of the model was
tested by using the coefficient of determination (R-squared)
from the indicated lag/lead. Forecasts for the Oct 83-Mar 84
time period were also generated in this process and the
results compared to the univariate models' forecasts. A
combined model using time series analysis and regression
was also formulated.

B.  THE  UNEMPLOYMENT  MODEL 

The  data  for  unemployment  in  the  20-24  age  bracket  was 
collected from monthly publications by the Bureau of Labor
Statistics. These figures cover the period from October 1980
through September 1983 and are presented in Appendix A. This
particular  age  grouping  was  selected  as  being  the  most 
appropriate  for  personnel  completing  their  first  enlistment 
and  facing  the  reenlistment  decision. 


a summary of the forecasts for all three methods and the
percentage of error for each forecast.

The combined model resulted in improved forecasts in
three of the five ratings with the other two showing either
minor improvement or a slight decline (BT/OS). It should be
noted at this point that these two data sets were the most
volatile in terms of range of observations. The forecasts
for these ratings could be improved by eliminating
significant outliers from the data sets and recomputing all of the
models. Due to the already small size of the data sets, this
was not considered. This problem should be corrected in
future works in this area as more data points become
available for analysis.
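
The logic of the combined model, a regression forecast adjusted by a time series forecast of the regression residuals, can be sketched as below. This is a simplified illustration under stated assumptions: the residuals are modeled here as a first order autoregressive process for every rating, whereas the residual models actually selected in this study differ by rating (see Appendix C), and all function names are hypothetical.

```python
def fit_line(x, y):
    """Least squares intercept and slope for y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
    return mean_y - b * mean_x, b

def fit_ar1(e):
    """Least squares AR(1) coefficient for a zero-mean residual series."""
    num = sum(e[t] * e[t - 1] for t in range(1, len(e)))
    den = sum(e[t - 1] ** 2 for t in range(1, len(e)))
    return num / den

def combined_forecast(x, y, x_next):
    """One-step forecast: regression forecast plus forecast residual."""
    a, b = fit_line(x, y)
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    phi = fit_ar1(resid)
    regression_part = a + b * x_next
    residual_part = phi * resid[-1]   # AR(1) forecast of the next residual
    return regression_part + residual_part
```

The residual term acts as a correction: when the regression has been systematically over- or under-predicting in recent periods, the time series model carries part of that error forward into the forecast.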


FORECAST  COMPARISONS  OF  ALL  THREE  MODELS 
IV.  SUMMARY AND CONCLUSIONS


A.  SUMMARY 

Several forecasting techniques have been examined in
this thesis in an attempt to predict the pattern of
reenlistments in five specific ratings. Two distinct methods
were used to build three models: a univariate Box-Jenkins
model, a linear regression model and a combined regression
and Box-Jenkins model.

The results of each varied in predictive ability, with
the combined model being clearly superior to the other two
(as measured by percent error relative to the actual
observations), but the results by rating differed sharply
within each model. For electronics technicians all three
models were clearly adequate. This is not surprising since
this rating had the smallest range and the least variance in
reenlistments during the time period examined. The
regression equation for ET's yielded a very low R-squared
value for the model. Appropriate additional explanatory
variables may be the level of reenlistment bonuses or the
availability of advanced technical training. Boiler
technicians and operations specialists showed the widest
range in reenlistment percentages and, as expected, their
models exhibited the least accuracy. The regression
equations for these ratings were counterintuitive in that
they indicated higher reenlistment rates at successively
lower levels of the independent variable, unemployment. This
indicates that an additional independent variable may be
required in the equation for these ratings. For BT's this may
be a dummy variable accounting for the unpleasantness of the
working conditions or the level of their reenlistment
bonuses. For the OS rating, it may also be the reenlistment
bonus level or a factor accounting for the high amount of sea
duty present in that rating when compared to others. The
ratings of yeoman and storekeeper presented models that were
marginal when using Box-Jenkins or regression separately but
presented quite good predictions when utilizing the combined
model.

A somewhat surprising result of the models' forecasts was
the accuracy of the regression model using 20-24 year old
unemployment as the only independent variable. This is
surprising in view of the low R-squared values of the models
and the high degree of serial correlation remaining in the
residuals as expressed by the Durbin-Watson statistic. This
was actually the second most accurate prediction model,
outperforming the univariate Box-Jenkins model by a slight
margin.

The Box-Jenkins models' performance was restricted by
the size of the data set available. Technically, thirty or
more observations in a data set are considered sufficient
but 100 or more observations are considered desirable in
order to utilize the full predictive power of the model.
This larger number of observations is also considered
desirable in terms of identifying the underlying trends and
patterns which may not appear in a smaller set of data. In
terms of forecasting reenlistments, it is not possible at
this point to utilize any more data points than were
available for this study since the monthly figures are aggregated
by quarters after three years and only the quarterly data
are retained.


B.  CONCLUSIONS 

A surprising finding, for all of the ratings modeled, is the
continued rise in reenlistments in view of the ever
improving economy during the period. This could possibly be
explained, in a regression model, with the introduction of
the civilian/military pay ratio for the period or the level
of reenlistment bonuses for a rating. This would still,
however, not account for the 95 percent reenlistment rate
for electronics technicians, who are generally regarded as
having the most desirable and marketable skills in almost
any employment market. Another explanatory term could be
introduced for "taste" for military service much as the ACOL
model uses. In light of the world situation and recent
events in Lebanon, Grenada and the Persian Gulf, this could
be a significant explanatory factor for the continued rise
in reenlistments.

In terms of policy implications, the results for all of
the models utilized indicate that high levels of first term
retention are likely to continue in all of these ratings for
the next six to twelve months. At this point, decisions will
be required on how to deal with these increases in a service
that is rapidly approaching authorized end strength. The
longer range forecasts still seem to indicate that the
currently favorable climate will eventually give way to an
ever improving economy. Now would seem to be the most
opportune time to take advantage of the situation by increasing
the total number of personnel in the career force as a hedge
against the future change in the demographics of the cohort
eligible for military service. This will induce a short
term increase in compensation costs by increasing the
inventory of career petty officers. This will eventually be
offset by the reduction in future training and recruiting
costs that will result from this larger career force.


C.  SUGGESTIONS  FOR  FUTURE  RESEARCH 


As previously stated, the full potential of the
Box-Jenkins method has not been exploited because of
restrictions in the amount of data available. Further, this
restriction can only be corrected with the passage of time
as more observations become available. The models presented
in this paper were rating-specific, which may only have
limited application. In a broad sense, however, research
should continue along these lines with aggregate models of
rating groups. As to how this aggregation should be
performed, Thomas and Liao [Ref. 3] have suggested grouping
ratings by observed reenlistment behavior in their model of
second and subsequent term careerists. This grouping should
be conducive to application of Box-Jenkins techniques, which
appear to be more effective when dealing with a data set of
narrow range. Another possibility for grouping could follow
the level of skill required as indicated by a rating being
termed high-tech, medium tech or low tech [Ref. 1].

The  combined  regression,  Box-Jenkins  model  presented 
here  also  deserves  future  consideration  as  it  appears  to  be 
a  viable  "fine  tuning"  method  for  regression  models  and 
intuitively  more  appealing  than  introducing  more  and  more 
variables  into  the  analysis.  Use  of  a  combined  modeling 
technique  can  only  serve  to  strengthen  the  results  of 
regression  models  that  are  currently  very  popular. 

The Box-Jenkins method is not meant to be an all
encompassing method for use in manpower modeling. It certainly
should, however, be considered as a tool to be placed in the
arsenal of the manpower planner for continued use and
development. In view of the many commercial software packages
available for this technique, implementation and application
to manpower issues should most strongly be considered for
use in Navy manpower planning.


APPENDIX  A 

SUMMARY OF DATA USED IN ANALYSIS


TABLE VII

DATA  UTILIZED  FOR  ANALYSIS 
APPENDIX B

THE  BOX-JENKINS  METHOD 


A.  OVERVIEW 

The Box-Jenkins procedure can be used to fit and
forecast time series data by means of a general class of
statistical models. An observation at a given point in time is
modeled as a function of its past values and/or current and
past values of the random errors, both at seasonal and
non-seasonal lags. Box-Jenkins methodology will model a variable
with observations equally spaced in time and no missing
values. Sometimes it may be necessary, before modeling the
series, to transform the data by taking the log function,
square root, power of the series or to difference the series
on a seasonal or non-seasonal basis.

The modeling of time series data is usually done in
three steps. First, identify a tentative model for the
series. Second, estimate the parameters³ and examine the
diagnostic plots and statistics. Third, if the model is
deemed acceptable, utilize the procedure for forecasting. If
the model is inadequate, return to step one and evaluate the
time series for more appropriate models until an acceptable
one is found. Fig. B.1 illustrates the steps required in
Box-Jenkins analysis.

The advantage gained in using Box-Jenkins analysis is
that it allows the data to speak for itself since it is a
univariate procedure and therefore does not allow for
explanatory variables. The underlying more restrictive
assumption throughout all Box-Jenkins procedures is that the
time series will eventually repeat itself [Ref. 7], or that
there is some pattern underlying the data.


³For most software packages applied to Box-Jenkins
modeling, the estimation of parameters is done automatically
in the program, leaving the researcher free to concentrate on
analysis of the resulting statistics.


B.  WHAT  IS  A  TIME  SERIES 

A time series is a collection of observations made
sequentially over time at specific intervals such as
days, weeks, months or years. In addition, a certain
dependence is supposed from one period to the next. It is this
interdependence that is of value when trying to predict
future activity for a time series. Examples of time series
abound in fields ranging from business to physics. Time
series analysis can be applied to analyze monthly sales for a
company, yields on fiduciary notes or the chemical yield of a
substance in a controlled procedure. A time series can
be used to analyze observations that are either discrete or
continuous; by way of example, a discrete time series would
be the closing stock price of a company and a continuous
time series would be the temperature at a weather station.
In summary, a discrete time series is one where observations
exist at a point in time and a continuous time series has
potential observations at all points in time. For purposes
of this discussion, only discrete time series will be
addressed.
C.  STATIONARITY

In  pursuing  a  time  series  model,  the  first  assumption  to 
be  made  in  the  analysis  is  that  the  data  set  is  stationary. 
By  this  it  is  meant  that  the  observations  oscillate  around  a 
constant  mean  that  shows  no  growth  over  time.  Deviations 
about  this  mean  are  temporary  and  in  the  long  run  display 
equilibrium about the mean [Ref. 4].
A further measure of stationarity can be gained from the
autocorrelation of the time series, that is, the correlation
between successive observations from the same data set. An
observation at time t, denoted by $Z_t$, when correlated with
an observation $Z_{t+k}$ from the same data set k periods
later is said to produce an autocorrelation. The
autocorrelation at lag k is measured by $\rho_k$ and provides
important information about the nature of the data set. A
value close to +1 indicates a high degree of positive
correlation between observations, while a value close to -1
indicates a high negative correlation.
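
As an illustration, the sample autocorrelation at lag k can be computed with the usual estimator, in which deviations from the overall mean are multiplied k periods apart and scaled by the total sum of squared deviations. The function name is hypothetical and the sketch is not taken from the software used in this study.

```python
def autocorrelation(z, k):
    """Sample autocorrelation of the series z at lag k."""
    n = len(z)
    mean = sum(z) / n
    c0 = sum((v - mean) ** 2 for v in z)
    ck = sum((z[t] - mean) * (z[t + k] - mean) for t in range(n - k))
    return ck / c0
```

A steadily trending series gives a large positive value at lag 1, while a series that alternates in sign from one period to the next gives a value near -1.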

Most  time  series  are  not  stationary  and  require  some 
type  of  transformation  prior  to  analysis. 

D.  TIME  SERIES  PLOT 

The first step in determining whether a time series is
stationary or not is to construct a time series plot of the
data, which plots observations against time in an attempt to
visually determine any obvious patterns in the data. Fig.
B.2 illustrates the United States gross national product for
the years 1947 through 1970 on a quarterly basis. This plot
shows a clear upward trend in the data, which indicates the
data is not stationary. Further, within each year there are
apparently recurring patterns for each quarter that repeat
on an annual basis. Finally, as time passes, the variance in
GNP tends to become larger and more volatile. Clearly, this
data set must be transformed prior to further analysis by
the Box-Jenkins method.

E.  DATA  TRANSFORMATION 

To continue with the example of GNP, there are several
possible transformations that can be used to induce
stationarity. The first step is to induce a constant variance
in the data. This can be accomplished either through a
logarithmic or a square root transformation. Fig. B.3
illustrates the results of a square root transform of the data.
The trend is still clearly present but the variance has been
smoothed considerably.

Figure B.2 UNITED STATES GNP 1947 - 1970.

Once variance has been stabilized, the next step is to
remove the trend. There are several sophisticated
regression techniques available to accomplish this; however, the
method of differencing will be the only one addressed here.
For a more detailed discussion of these alternative
techniques, the reader is directed to Makridakis and Wheelwright
[Ref. 4].

The method of differencing a time series consists of
subtracting the values of the time series from each other in
a specified order. By way of example, consider the following
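
The example that originally followed is not recoverable from this copy. In its place, a minimal sketch of the two operations discussed above (a variance stabilizing transform followed by differencing) is given here, using hypothetical values and a hypothetical function name.

```python
import math

def difference(z, lag=1):
    """Subtract from each value the value `lag` periods earlier."""
    return [z[t] - z[t - lag] for t in range(lag, len(z))]

# Hypothetical series with an accelerating trend (not GNP data).
series = [1.0, 4.0, 9.0, 16.0]

first_diff = difference(series)                        # [3.0, 5.0, 7.0]
second_diff = difference(first_diff)                   # [2.0, 2.0]
root_diff = difference([math.sqrt(v) for v in series])  # [1.0, 1.0, 1.0]
```

A lag of 1 gives regular differencing. For seasonal differencing the lag would be set to the length of the season, for example 4 for quarterly data or 12 for monthly data.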


F.  AUTOCORRELATION AND PARTIAL AUTOCORRELATION


A useful tool in model estimation is the autocorrelation
function (ACF), which can be defined as the association or
mutual dependence between values of the same variable but at
different time periods. These ACF coefficients provide
valuable information about a data set and any pattern that
may be present. If, for example, a high positive coefficient
appeared every twelve months, a seasonal trend may be
considered to exist.

The partial autocorrelation function (PACF) is another
and complementary measure to be applied along with the ACF
to aid in determining model type. PACF's are analogous to
ACF's in that they indicate the relationship of the values
of a time series to various time lagged values of the same
series. They differ from ACF's, however, in that they are
computed for each time lag after removing the effect of all
other time lags. In essence, they show the relative strength
of the relationship that exists for varying time lags.
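
One standard way to obtain partial autocorrelations from a set of autocorrelations is the Durbin-Levinson recursion, sketched below. It is offered as an illustration and is not necessarily the computation performed by the software used in this study. The function name is hypothetical.

```python
def partial_autocorrelations(rho):
    """Partial autocorrelations for lags 1..K by Durbin-Levinson.

    rho[k] is the lag-k autocorrelation, with rho[0] equal to 1.
    """
    max_lag = len(rho) - 1
    pacf = []
    prev = []  # coefficients phi(k-1, 1) .. phi(k-1, k-1)
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = rho[1]
            cur = [phi_kk]
        else:
            num = rho[k] - sum(prev[j - 1] * rho[k - j] for j in range(1, k))
            den = 1.0 - sum(prev[j - 1] * rho[j] for j in range(1, k))
            phi_kk = num / den
            # Update the lower order coefficients, then append phi(k, k).
            cur = [prev[j - 1] - phi_kk * prev[k - j - 1]
                   for j in range(1, k)] + [phi_kk]
        pacf.append(phi_kk)
        prev = cur
    return pacf
```

For an autoregressive process of order one, whose autocorrelations decay geometrically, the recursion returns a single nonzero value at lag 1 and zeros thereafter. This cutoff is the pattern looked for when the ACF and PACF are read together.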

When the ACF and the PACF are analyzed together, they
provide a very powerful tool for initial model selection.
Fig. B.4 through Fig. B.6 summarize the general shapes
associated with the different types of models.

G.  THE  AUTOREGRESSIVE  MODEL 

A time series is said to be governed by an
autoregressive (AR) process if the current value of the time series
can be expressed as a linear function of the previous value or
values plus some error term or random shock value [Ref. 8].
The assumptions made here are that the data set is
stationary and the error terms are normally and
independently distributed with a mean of zero and constant
variance. A check on the adequacy of the model is to construct
an ACF for the residuals of the model and determine if they
are random in nature. Mathematically, an AR (1) model is of
the form:


$Z_t = \phi Z_{t-1} + a_t$    (eqn B.1)

where $\phi$ is equal to the autoregressive coefficient and
$a_t$ is the random error or shock term.
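
Eqn B.1 can be illustrated by simulating an AR (1) series, recovering the coefficient by least squares and producing forecasts. The sketch assumes the series is expressed as deviations from its mean, and the function names are hypothetical.

```python
import random

def simulate_ar1(phi, n, seed=0):
    """Generate n values of Z_t = phi * Z_{t-1} + a_t with unit normal shocks."""
    rng = random.Random(seed)
    series, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, 1.0)
        series.append(prev)
    return series

def estimate_ar1(z):
    """Least squares estimate of phi for a zero-mean series."""
    num = sum(z[t] * z[t - 1] for t in range(1, len(z)))
    den = sum(z[t - 1] ** 2 for t in range(1, len(z)))
    return num / den

def forecast_ar1(z, phi, steps):
    """Forecasts for the next `steps` periods; future shocks are set to zero."""
    out, last = [], z[-1]
    for _ in range(steps):
        last = phi * last
        out.append(last)
    return out
```

Because each forecast is the previous one multiplied by the coefficient, forecasts from a stationary AR (1) model decay toward the mean as the horizon lengthens, which is one reason such models are best suited to short term prediction.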

H.  THE  MOVING  AVERAGE  MODEL 

A time series is said to be governed by a moving average
process if the current value of the time series can be
expressed as a linear function of the current error term and
previous error term(s). The same restrictions apply to the
error terms of an MA model as applied to the AR model.
Mathematically, an MA (1) model can be expressed as:


$Z_t = a_t - \theta a_{t-1}$    (eqn B.2)

where $\theta$ is the moving average coefficient and $a_t$
is again the error term.
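
A short sketch of an MA (1) process written as the current shock minus theta times the previous shock follows. The function names are hypothetical, and the shock before the first observation is assumed to be zero. The lag 1 autocorrelation formula is the standard result for this form of the model; autocorrelations at all higher lags are zero.

```python
def ma1_series(theta, shocks):
    """Build Z_t = a_t - theta * a_{t-1} from a list of shocks a_t."""
    series, prev_shock = [], 0.0   # assumes a zero shock before the start
    for a in shocks:
        series.append(a - theta * prev_shock)
        prev_shock = a
    return series

def ma1_lag1_autocorrelation(theta):
    """Theoretical lag 1 autocorrelation of an MA (1) process."""
    return -theta / (1.0 + theta ** 2)
```

A single shock therefore affects only two consecutive observations, which is why the ACF of an MA (1) process cuts off after lag 1 while its PACF tails off.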

I.  THE  MIXED  AUTOREGRESSIVE  MOVING  AVERAGE  MODEL 

The mixed autoregressive moving average model contains
elements of both the AR and the MA procedures and expresses
the relationship of a current observation as a linear
function of both past values and past errors of the variable. As
with the AR and MA models, the residuals of the model are
evaluated for adequacy by utilizing the ACF function. The
equation for the ARMA (1,1) model is:


$Z_t = \phi Z_{t-1} + a_t - \theta a_{t-1}$    (eqn B.3)


Figure B.4 TYPICAL FORM OF AR 1 MODEL ACF AND PACF.

J.  EVALUATING  THE  MODEL 

Once  a  model  has  been  selected,  there  are  several  ways 
to  check  the  results  for  adequacy.  For  purposes  of  this 
discussion,  the  following  checks  will  be  addressed: 

1.  ACF  of  residuals 

2.  Minimum  sum  of  squares 

3.  t-ratio 

There are several other checks for adequacy that are
available to the user of Box-Jenkins methodology. For a more
comprehensive explanation the reader is directed to Vandaele
[Ref. 8].


Figure B.5 TYPICAL FORM OF MA 1 MODEL ACF AND PACF.

1.  ACF of the Residuals

As mentioned throughout this discussion, the ACF for
the residuals of a model should be random about the mean
zero with constant variance and a magnitude less than two
standard errors.

2.  Minimum  Sum  of  Squares 

Determination  of  this  measure  can  only  be  achieved 
by  comparison  with  other  potential  models.  In  some  cases, 
several  models  may  produce  insignificantly  different  sums  of 
squares  which  will  test  the  user's  judgment  and  application 
of  the  other  measures  of  adequacy. 

3.  t-Ratio 


In time series analysis this is computed by dividing
the estimate of the parameter for the model by the standard
deviation of that estimate. The rules as applied to
regression analysis still hold in that the value should be greater
than 2.0 in absolute value in order to indicate that the
coefficient is significantly different from zero.
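
As a worked check, the t-ratio of the unemployment MA 1 model can be reproduced from the estimate and standard deviation reported in Table III. The function name is hypothetical, and the small difference from the tabled 14.29 is presumably due to rounding of the printed inputs.

```python
def t_ratio(estimate, std_dev):
    """Parameter estimate divided by the standard deviation of the estimate."""
    return estimate / std_dev

# Values as printed in Table III for the unemployment MA 1 parameter.
t_unemployment = t_ratio(0.9850, 0.0689)        # about 14.3
is_significant = abs(t_unemployment) > 2.0      # True
```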


K.  PARSIMONY 

In the event that more than one model is capable of
satisfying the acceptance criteria described above, the
principle of parsimony will then apply. This states that
when faced with several sufficient model types, select the
lowest order model available that satisfies the criteria.

L.  TRANSFER  FUNCTION  MODELS 

Transfer function models are also known as multivariate
autoregressive integrated moving average (MARIMA) or leading
indicator models. Building one involves selection of an
appropriate univariate model for what is to be the
independent variable and applying it to the dependent variable.
The application of the model will result in two sets of
residuals which, when cross correlated at different time
lags, will yield the cross correlation function (CCF). This
differs slightly from the ACF discussed earlier in that we
now expect to find a relationship of significant magnitude
at various points in the comparison. This positive
correlation is an indication that there is a significant
relationship between the independent and the dependent
variable at certain lags or leads in time.
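
The step of applying one univariate model to both series can be sketched for the ARIMA (0,1,1) case. After one regular difference, the model of eqn B.2 is inverted to recover the shocks, and the same filter with the same coefficient is then run over the dependent series. The function name is hypothetical and the shock before the first observation is assumed to be zero, which is a simplification of the backforecasting used by statistical packages.

```python
def ima11_residuals(series, theta):
    """Residuals from an ARIMA (0,1,1) filter with coefficient theta.

    The series is differenced once, then a_t = w_t + theta * a_{t-1}
    is applied, where w_t is the differenced value.
    """
    diffs = [series[t] - series[t - 1] for t in range(1, len(series))]
    residuals, prev = [], 0.0   # assumes a zero starting shock
    for w in diffs:
        prev = w + theta * prev
        residuals.append(prev)
    return residuals
```

The two residual series produced this way, one from the independent variable and one from the dependent variable, are the inputs to the cross correlation function.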


Figure B.6 TYPICAL FORM OF ARMA 1,1 MODEL ACF AND PACF.
APPENDIX C

SUMMARY OF BOX-JENKINS MODELS USED FOR ANALYSIS


A.  MODELS USED IN UNIVARIATE ANALYSIS

1.  Yeoman

Model Type - ARIMA (1,0,0)
Model Equation

$Z_t = .448 Z_{t-1} + a_t$    (eqn C.1)

T-ratio - 2.64
Model Residuals - Random

2.  Storekeepers

Model Type - ARIMA (0,1,1)
Model Equation

$Z_t = -.814 a_{t-1} + a_t$    (eqn C.2)

T-ratio - 7.24
Model Residuals - Random

3.  Operations Specialists

Model Type - ARIMA (1,0,0)
Model Equation

$Z_t = .499 Z_{t-1} + a_t$    (eqn C.3)

T-ratio - 3.39
Model Residuals - Random

4.  Electronics Technicians

Model Type - ARIMA (1,0,0)
Model Equation

$Z_t = .588 Z_{t-1} + a_t$    (eqn C.4)

T-ratio - 4.18
Model Residuals - Random

5.  Boiler Technicians

Model Type - ARIMA (1,0,0)
Model Equation

$Z_t = .541 Z_{t-1} + a_t$    (eqn C.5)

T-ratio - 3.63
Model Residuals - Random

B.  BOX-JENKINS MODELS OF REGRESSION RESIDUALS

1.  Yeoman

Model Type - ARIMA (0,0,1)
Model Equation

$Z_t = -.376 a_{t-1} + a_t$    (eqn C.6)

T-ratio - -2.19
Model Residuals - Random

2.  Storekeepers

Model Type - ARIMA (1,1,0)
Model Equation

$Z_t = .498 Z_{t-1} + a_t$    (eqn C.7)

T-ratio - -3.00
Model Residuals - Random

3.  Operations Specialists

Model Type - ARIMA (1,1,1)
Model Equation

$Z_t = .604 Z_{t-1} + a_t - .956 a_{t-1}$    (eqn C.8)

T-ratio - AR 1 - 3.28
          MA 1 - 9.75
Model Residuals - Random

4.  Electronics Technicians

Model Type - ARIMA (1,0,1)
Model Equation

$Z_t = .854 Z_{t-1} + a_t - .529 a_{t-1}$    (eqn C.9)

T-ratio - AR 1 - 5.21
          MA 1 - 2.04
Model Residuals - Random

5.  Boiler Technicians

Model Type - ARIMA (1,0,0)
Model Equation

$Z_t = .590 Z_{t-1} + a_t$    (eqn C.10)

T-ratio - 3.70
Model Residuals - Random